HIC: A Robust and Efficient Hyper-Image-Based Clustering for Very Large Datasets

نویسندگان

  • Kun-Che Lu
  • Don-Lin Yang
چکیده

Most existing clustering approaches not only require several scans of a dataset but also have a very high computational cost. In this paper, we propose a novel, efficient, and effective clustering framework which requires only one scan of the input dataset. In the beginning, the original dataset is transformed and merged into a hyper-image. After that, the dissimilarities between data points are measured, once and for all, by using various image-processing methodologies. Then, image segmentation techniques are applied to extract clusters from the hyper-image. The resulting clusters can be further processed to achieve fuzzy and/or hierarchical clustering effects. Moreover, the proposed framework can cluster incrementally and even dynamically with only one scan of the updated records. With this capability, it can also be used to effectively cluster streaming data. Experimental results show that our approach is robust and stable under various parameter settings and data distributions, and it is more powerful and sophisticated than other methodologies.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A stack-based chaotic algorithm for encryption of colored images

In this paper, a new method is presented for encryption of colored images. This method is based on using stack data structure and chaos which make the image encryption algorithm more efficient and robust. In the proposed algorithm, a series of data whose range is between 0 and 3 is generated using chaotic logistic system. Then, the original image is divided into four subimages, and these four i...

متن کامل

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

Swarm Intelligence (SI) is an innovative artificial intelligence technique for solving complex optimization problems. Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. Clustering algorithms have been applied to a ...

متن کامل

An improved opposition-based Crow Search Algorithm for Data Clustering

Data clustering is an ideal way of working with a huge amount of data and looking for a structure in the dataset. In other words, clustering is the classification of the same data; the similarity among the data in a cluster is maximum and the similarity among the data in the different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CS...

متن کامل

An Improved SSPCO Optimization Algorithm for Solve of the Clustering Problem

Swarm Intelligence (SI) is an innovative artificial intelligence technique for solving complex optimization problems. Data clustering is the process of grouping data into a number of clusters. The goal of data clustering is to make the data in the same cluster share a high degree of similarity while being very dissimilar to data from other clusters. Clustering algorithms have been applied to a ...

متن کامل

Intelligent scalable image watermarking robust against progressive DWT-based compression using genetic algorithms

Image watermarking refers to the process of embedding an authentication message, called watermark, into the host image to uniquely identify the ownership. In this paper a novel, intelligent, scalable, robust wavelet-based watermarking approach is proposed. The proposed approach employs a genetic algorithm to find nearly optimal positions to insert watermark. The embedding positions coded as chr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • J. Inf. Sci. Eng.

دوره 26  شماره 

صفحات  -

تاریخ انتشار 2010